The present study investigated the feasibility of using self-paced eye
movements during reading (measured by an eye tracker) as markers 
for calculating hemodynamic brain responses measured by func- 
tional magnetic resonance imaging (fMRI). Specifically, we were in- 
terested in whether the fixation-related fMRI analysis approach was 
sensitive enough to detect activation differences between reading 
material (words and pseudowords) and nonreading material (line and 
unfamiliar Hebrew strings). Reliable reading-related activation was 
identified in left hemisphere superior temporal, middle temporal, and 
occipito-temporal regions including the visual word form area 
(VWFA). The results of the present study are encouraging insofar as 
fixation-related analysis could be used in future fMRI studies to 
clarify some of the inconsistent findings in the literature regarding 
the VWFA. Our study is the first step in investigating specific visual 
word recognition processes during self-paced natural sentence 
reading via simultaneous eye tracking and fMRI, thus aiming at an 
ecologically valid measurement of reading processes. We provided 
the proof of concept and methodological framework for the analysis 
of fixation-related fMRI activation in the domain of reading research. 

Keywords: cerebrum, functional magnetic resonance imaging, language, 
visual word form area, visual word recognition 

Introduction 

Reading is a complex activity involving numerous cognitive 
processes. In the typical everyday natural reading situation, a 
reader silently scans sentences and texts at his or her own 
pace with the intention of extracting information. In contrast, 
the majority of neurocognitive studies on reading suffer from 
an ecologically invalid reading situation. Specifically, in 
typical electroencephalography (EEG), magnetoencephalogra- 
phy (MEG), and functional magnetic resonance imaging 
(fMRI) studies, the participants are presented with isolated 
letter strings (e.g., words, pseudowords) without context and 
asked to perform a more or less artificial task (e.g., reading 
aloud, rhyme judgment, lexical decision). Furthermore, the exposure
duration of the stimuli is chosen rather arbitrarily and controlled
by the experimenter. However, it is well known that
the brain response can be altered by exposure time and task
demands (e.g., via top-down activation) in such a way that it 
no longer reflects the processing of the stimulus material per 
se (e.g., Dehaene and Cohen 2011). 

Apart from single-word presentation, other common para- 
digms lack ecological validity as well. For example, studies 
based on rapid serial visual presentation (RSVP) of words or
parts of sentences suffer from the problem that the rate of
presentation (i.e., which information is presented and when it is
attended to) is externally determined instead of internally
controlled (i.e., by the subject). Recently, it has been reported 
that the presentation rate utilized in RSVP experiments can 
substantially affect electrophysiological brain responses (Dam- 
bacher et al. 2012). Hence, these studies are probably limited 
in advancing our understanding of neurocognitive processes 
during natural reading. Rather, they inform us on the neural 
correlates of visual word processing during certain 
reading-related situations or tasks. 

Paradigms in which whole sentences or text passages are 
presented and participants are unconstrained with respect to 
the point of time and target location of their eye movements 
offer more natural reading situations. On the downside, any 
step to improve ecological validity usually implies a loss in 
experimental control. For example, the complex pattern of fix- 
ations during natural reading (e.g., refixations, regressions, 
word skippings, etc.) varies across trials and participants. In 
consequence, the experimental situation is more natural but 
data analyses are potentially less reliable and more difficult to 
interpret. However, in typical fMRI studies sentences (consist- 
ing of several words) are treated as unitary events starting with 
the initial appearance of the text on the screen. Thereby, the 
measured signal represents an average across multiple distinct 
neurocognitive processes during reading. Accordingly, in 
order to investigate processes associated with visual word pro- 
cessing during natural reading, one has to present sentences 
but treat the processing of the different parts of sentences (i.e., 
words) as separate events. 

In the domain of EEG, this was realized by the use of 
fixation-related potentials (FRPs; Hutzler et al. 2007; Dimigen 
et al. 2011). In this approach, an eye tracker is used to measure 
the subject's eye movements (i.e., saccades) during reading 
and the resting periods on words (i.e., fixations) are used as 
markers for calculating electrophysiological brain potentials. 
The reasoning is that the point of time when a word is fixated 
for the first time (compared with the point of time when a 
word appears on the screen) is a more valid indicator for the 
beginning of cognitive processes that depend upon foveal per- 
ception of that word. Note that in natural reading there is also 
a significant amount of parafoveal preprocessing of upcoming, 
not yet fixated words, which — by definition — starts before 
foveal processing (e.g., Rayner 1998). Furthermore, eye move- 
ments represent internally (by the subject) generated shifts of 
attention, whereas in typical paradigms it is determined 
externally (by the experimenter) when the next stimulus is
presented (and attended to).

The feasibility of the FRP approach for the domain of 
reading research has been well established. Prominent EEG 
effects were replicated and extended, for example the old-new 
effect (Hutzler et al. 2007) and the word-predictability effect 
(Dimigen et al. 2011). The benefit of FRPs over traditional 
ERPs is that they allow experimental paradigms resembling 
natural reading conditions. Evidence for differing effects
between artificial and natural reading situations is only beginning
to emerge, but initial studies point to an earlier onset of
electrophysiological components under natural conditions 
(Dimigen et al. 2011), most likely resulting from parafoveal 
lexical preprocessing (Reingold et al. 2012). This is important 
because the latency of specific effects is one of the most valu- 
able pieces of information to be gained from EEG studies. 

Complementary to EEG, fMRI offers the possibility of study- 
ing human brain function at a high spatial resolution. Recently, 
Marsman et al. (2012) transferred the logic of the FRP ap- 
proach to fMRI. In a visual object processing task, they demon- 
strated the feasibility of using fixations as onsets for calculating 
hemodynamic responses. The success of the fixation-related 
analysis approach is remarkable, given the low temporal resol- 
ution of fMRI compared with EEG. Specifically, concerns 
about using fixations as onsets in the analysis of fMRI data 
relate to two temporal properties of fixations: first, they are 
relatively short (~200-300 ms) and, secondly, they usually 
occur at a relatively high rate (~3-4 per second), only split by 
short saccades (~20-30 ms) during which no visual infor- 
mation is picked up. 

The results of Marsman et al. (2012) are encouraging in that
they suggest the fixation-related analysis approach may also be
applied to reading material. Specifically, Marsman et al. (2012) 
presented screens with a circular array of visual objects (pic- 
tures of faces and houses) and the participants were free to 
explore these screens, with a memory task following stimulus 
presentation. The eye movement behavior of the participants 
corresponded to typical viewing behavior during reading with 
brief fixations, high frequency of saccades, and a number of re- 
fixations and regressions. Importantly, fixations corresponded 
to an fMRI response reliably enough for identification of acti- 
vation differences between faces and houses. Thus, this study 
provides evidence for the fundamental feasibility of using fix- 
ations as markers for the onset of hemodynamic events. 

The present study investigated whether fixation-related 
fMRI analysis may also be applied to the domain of reading re- 
search. However, before applying this method to natural 
reading situations (i.e., sentence reading), it must be shown 
that analysis of fMRI responses from self-paced fixations is sen- 
sitive enough to identify activation differences between 
reading material (e.g., words) and nonreading material (e.g., 
line strings). In order to provide this proof of concept, we 
adapted the Marsman et al. (2012) study and instead of faces 
and houses we presented familiar letter strings (words),
unfamiliar letter strings (pseudowords), simple nonletter strings
(line strings), and complex unfamiliar character strings 
(Hebrew strings). Note that the circular array of stimuli and the 
judgment task used in our study are very different from natural
reading. The main purpose of our study, however, was to 
investigate whether the fixation-related fMRI approach as 
implemented by Marsman et al. (2012) for visual object proces- 
sing would also be suitable for visual letter string processing. 



We expected to find activation of the typical task-positive 
bilateral network for visual processing of character strings (in- 
dependent of stimulus type). This is important to assure 
general data quality and suitability for further analysis. 
However, the critical expectation referred to the comparison 
between the different stimulus categories. We expected higher 
activation for reading material (words and pseudowords) com- 
pared with nonreading material (line and Hebrew strings) in 
left hemisphere language regions. Specifically, reading-related 
activation was expected in left posterior temporal, left occipito- 
temporal, and left inferior frontal regions. These regions are 
generally accepted as important reading-related regions (see 
reviews by Jobard et al. 2003; Demonet et al. 2004; Schlaggar 
and McCandliss 2007; Shaywitz and Shaywitz 2008; Price 
2012; Richlan 2012). The present study is the first step in the 
application of the fixation-related fMRI approach to the 
domain of reading research. It paves the way for subsequent 
fMRI studies utilizing self-paced natural sentence reading. 



Materials and Methods 

Participants 

Eighteen (13 females, 5 males) German-speaking adults in the age 
range of 17-35 years (M = 24.39 years; SD = 4.41 years) participated in
the study. All participants had normal or corrected-to-normal vision 
and reported no history of neurological disease or reading difficulties. 
None of the participants were familiar with Hebrew characters. Partici- 
pants gave written informed consent and were paid for their partici- 
pation. 



Stimuli and Task 

The stimulus set consisted of 108 words, 108 pseudowords, 108 line 
strings, and 108 Hebrew strings. Each item consisted of 5 characters of 
mono-spaced Courier New font, with single character widths not
exceeding 0.3° of visual angle (total item width ~1.7° of visual angle,
which is equivalent to typical reading situations). As evident from 
Table 1, words (with a log frequency of 0.96; CELEX database; Baayen 
et al. 1993) were matched to pseudowords on the following variables: 
number of syllables, number of Coltheart's orthographic neighbors 
(i.e., same-length words differing by one letter), log frequency of the 
highest frequency neighbor, initial bigram frequency, final bigram fre- 
quency, and summated bigram frequency. Line strings — serving as 
simple visual control stimuli — consisted of forward and backward 
slashes. Hebrew strings — serving as more complex visual control 
stimuli — were built of Hebrew characters, with which our participants 
were unfamiliar. Compared with the reading material (i.e., words and 
pseudowords), the line strings had fewer visual features (e.g., line
length, number of line junctions, number of brushstrokes), while the 
visual characteristics of the Hebrew strings were similar to those of the 
Latin letter strings. All stimuli were presented in the same font size and 
— due to the mono-spaced font — were equal in width. Ten stimuli from 
each of the four categories included a sequence with two identical 
symbols in immediate succession.



Table 1

Means (and standard deviations) of the item characteristics

Characteristics                                    Words           Pseudowords
Log frequency                                      0.96 (0.52)
Number of syllables                                1.87 (0.46)     1.88 (0.45)
Number of Coltheart's orthographic neighbors       2.35 (2.02)     2.33 (1.59)
Log frequency of the highest frequency neighbor    1.00 (0.94)     1.02 (0.93)
Initial bigram frequency                           437 (364)       437 (366)
Final bigram frequency                             1046 (1579)     1381 (2449)
Summated bigram frequency                          13475 (6882)    13428 (6722)

Figure 1. Stimuli and task. Trial sequence: fixation cross (3000-5000 ms), drift correction (variable duration), stimulus screen (4000-6000 ms), decision (3000 ms).



As illustrated in Figure 1, six stimuli were equidistantly arranged on
six fixed locations and simultaneously presented via a mirror on an 
MR-compatible LCD screen (NordicNeuroLab, Bergen, Norway). This 
design was borrowed from Marsman et al. (2012). The LCD screen was 
set to a refresh rate of 60 Hz and a resolution of 1024 x 768 pixels. A 
total number of 72 screens (each containing 6 stimuli) were presented 
by the Experiment Builder software (SR-Research Ltd., ON, Canada). 
In addition, 18 null-events were included in which a fixation cross was 
presented in the center of the screen. 

In contrast to the Marsman et al. (2012) study, which used a Stern- 
berg memory task, we chose an implicit reading task with similar at- 
tention and task demands to all of the four stimulus categories: our 
task required the participants to count the number of items with two 
identical consecutive characters. The rationale behind this task was 
that we wanted to ensure comparable processing demands for reading 
material and nonreading material, as well as for words and pseudo- 
words with as little top-down influence as possible. Nevertheless, the 
task should result in implicit, automatic reading of words and
pseudowords upon fixation. For example, on a screen with "Komet," a Hebrew string,
"Hobby," "/ \ / \ /," "Valer," and "/ \ / / \" the correct answer was
"two" as "Hobby" and "/ \ / / \" contained two identical characters in
immediate succession. Thirty-six screens contained no such strings, 32 
screens contained one such string, and 4 screens contained 2 such 
strings. Each trial started with a central fixation cross, which was pre- 
sented pseudo-randomly for 3000, 3500, 4000, 4500, or 5000 ms. The 
fixation cross was followed by a drift correction realized by the eye 
tracker (details below). After the drift correction, the stimulus screen 
was presented for 4000, 4500, 5000, 5500, or 6000 ms, depending on 
the duration of the fixation cross (that is, when the fixation cross was 
presented for 3000 ms, the stimulus screen was presented for 6000 
ms, and so on). The reason for the temporal jittering of the fixation 
cross and stimulus screen was to reduce expectancy effects of the par- 
ticipants. In contrast to Marsman et al. (2012), who presented the 
stimuli for up to 18 s, we chose a shorter duration to reduce the 
number of regressions on previously fixated items. Such regressions 
may lead to unwanted repetition priming effects. In summary, the combined
duration of the fixation cross and the stimulus screen was always 9000 ms plus a
variable duration resulting from the drift correction. After a stimulus
screen, a central question mark appeared for 3000 ms and the participants
were required to respond by pressing one of three keys on a response pad
with their thumb (corresponding to 0, 1, and 2 strings containing two
identical consecutive characters, respectively). Participants were
familiarized with the procedure in a training session outside the scanner
before scanning.
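
The timing scheme described above can be illustrated with a short sketch. It assumes that fixation cross durations were drawn uniformly at random from the five listed values; the text only states that the order was pseudo-random, so the sampling rule, the seed, and the function name are hypothetical.

```python
import random

# Fixation cross durations (ms) listed in the text. The stimulus screen
# duration is chosen so that both always sum to 9000 ms.
FIXATION_DURATIONS_MS = [3000, 3500, 4000, 4500, 5000]
TOTAL_MS = 9000


def make_trial_timings(n_trials, seed=0):
    """Return a list of (fixation_ms, stimulus_ms) pairs, one per trial.

    Uniform sampling is an assumption; the actual pseudo-random scheme
    is not specified in the text.
    """
    rng = random.Random(seed)  # seeded only to make the sketch reproducible
    timings = []
    for _ in range(n_trials):
        fixation_ms = rng.choice(FIXATION_DURATIONS_MS)
        stimulus_ms = TOTAL_MS - fixation_ms
        timings.append((fixation_ms, stimulus_ms))
    return timings


# 72 stimulus screens were presented in the experiment.
timings = make_trial_timings(72)
```

Coupling the two durations keeps the interval between trial onset and the decision screen constant while still varying the moment at which the stimuli appear.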



Data Acquisition and Analysis 

Eye movements were recorded by an Eyelink CL system (SR-Research 
Ltd., ON, Canada) in the long range set up. The camera was mounted 
on the head side of the scanner bore, nearest to the LCD screen. Move- 
ments of the right eye were recorded with a sampling rate of 1000 Hz. 
While recording, the head was stabilized in the head coil ~90 cm away 
from the screen. The eye tracker was calibrated with a 9-point cali- 
bration routine before each run and when the drift correction failed. 
Preceding each trial, a drift correction procedure was used to adapt the 
calibration to minor changes due to drifts. 

Fixations were attributed to an item when they fell within a
rectangle of 200 × 150 pixels around the center of an item (corresponding
to a width of ~4.5°). Fixations shorter than 80 ms were discarded from 
analysis. 
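
As a minimal sketch of this assignment rule, the following code maps fixations to items and returns the onset of the first valid fixation on each item. The data structure, the function names, and the item centers in the example are hypothetical; only the rectangle size and the 80 ms criterion come from the text.

```python
from dataclasses import dataclass

AOI_WIDTH_PX = 200    # rectangle width around the item center (from the text)
AOI_HEIGHT_PX = 150   # rectangle height around the item center (from the text)
MIN_FIXATION_MS = 80  # shorter fixations are discarded (from the text)


@dataclass
class Fixation:
    onset_ms: float
    duration_ms: float
    x: float
    y: float


def assign_item(fix, item_centers):
    """Return the index of the item whose rectangle contains the fixation, or None."""
    for idx, (cx, cy) in enumerate(item_centers):
        inside_x = abs(fix.x - cx) <= AOI_WIDTH_PX / 2
        inside_y = abs(fix.y - cy) <= AOI_HEIGHT_PX / 2
        if inside_x and inside_y:
            return idx
    return None


def first_fixation_onsets(fixations, item_centers):
    """Map item index to the onset (ms) of the first valid fixation on that item."""
    onsets = {}
    for fix in sorted(fixations, key=lambda f: f.onset_ms):
        if fix.duration_ms < MIN_FIXATION_MS:
            continue  # too short to count as a fixation
        item = assign_item(fix, item_centers)
        if item is not None and item not in onsets:
            onsets[item] = fix.onset_ms
    return onsets


# Hypothetical item centers (pixels) and fixation sequence for illustration.
centers = [(300, 300), (700, 300)]
fixations = [
    Fixation(0, 50, 300, 300),     # discarded: shorter than 80 ms
    Fixation(100, 250, 310, 290),  # first fixation on item 0
    Fixation(400, 200, 700, 300),  # first fixation on item 1
    Fixation(700, 220, 305, 300),  # refixation on item 0, not a new onset
    Fixation(1000, 200, 500, 600), # outside all rectangles
]
onsets = first_fixation_onsets(fixations, centers)
```

The sketch assumes that the rectangles of neighboring items do not overlap, so each fixation belongs to at most one item.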

Functional imaging data were acquired with a Siemens Magnetom 
Trio 3 Tesla scanner (Siemens AG, Erlangen, Germany) equipped with 
a 12-channel head coil. Functional images sensitive to blood oxygen
level-dependent (BOLD) contrast were acquired with a T2*-weighted
gradient echo EPI sequence (TR 2000 ms, TE 30 ms, matrix 64 × 64 mm,
FOV 192 mm, flip angle 80°). Thirty-six slices with a slice thickness of 
3 mm and a slice gap of 0.3 mm were acquired within the TR. Scanning 
proceeded in 3 runs with a variable number of scans per run. The exact 
number of scans depended on the participants' viewing behavior and 
the calibration procedure, and ranged from 197 to 359 scans (M = 224
scans, SD = 21 scans). In addition to the functional images, a gradient
echo field map (TR 488 ms, TE1 = 4.49 ms, TE2 = 6.95 ms) and a high
resolution (1 × 1 × 1.2 mm) structural scan with a T1-weighted MPRAGE
sequence were acquired from each participant. 

For preprocessing and statistical analysis, SPM8 software was used 
(http://www.fil.ion.ucl.ac.uk/spm/) running in a MATLAB 7.6 environ- 
ment (Mathworks, Inc., Natick, MA, USA). Functional images were cor- 
rected for geometric distortions by use of the FieldMap toolbox, 
realigned and unwarped, and then coregistered to the high-resolution 
structural image. The structural image was normalized to the MNI T1
template image, and the resulting parameters were used for normalization
of the functional images, which were resampled to isotropic
3 × 3 × 3 mm voxels and smoothed with a 6 mm FWHM Gaussian
kernel. No slice timing correction was applied. 

Statistical analysis was performed in a two-stage mixed effects 
model. The crucial analysis step of the fixation-related approach was 
realized during the subject-specific first level model specification. In 
contrast to traditional event-related analysis, where the onsets of 
stimuli are modeled, in the fixation-related analysis each first fixation 
on an item was modeled by a canonical hemodynamic response func- 
tion combined with time and dispersion derivatives comprising an in- 
formed basis set. The movement parameters derived from the 
realignment step during preprocessing were modeled as covariates of 
no interest. The functional data in these first level models were high- 
pass filtered with a cut-off of 128 s and corrected for autocorrelation by 
an AR(1) model (Friston et al. 2002). In these first-level models, the 
parameter estimates reflecting signal change for words versus baseline 
(which consisted of the inter-stimulus interval, the null-events, and the 
eye tracker drift correction/recalibration procedure), pseudowords 
versus baseline, line strings versus baseline, and Hebrew strings versus 
baseline were calculated in the context of a GLM (Henson 2004). These 
subject-specific contrast images were used for the second-level random 
effects analysis. 
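
The core idea of the fixation-related model, using first-fixation onsets instead of stimulus onsets as events, can be sketched as follows. This is an illustration only, not the SPM8 implementation: the double-gamma shape parameters are a commonly used approximation and are assumed here, the time and dispersion derivatives of the informed basis set are omitted, and the onsets and function names are hypothetical.

```python
import math

TR_S = 2.0  # repetition time reported in the text


def gamma_pdf(t, shape):
    """Gamma density with unit scale, evaluated at t (seconds)."""
    if t <= 0:
        return 0.0
    return math.exp((shape - 1) * math.log(t) - t - math.lgamma(shape))


def double_gamma_hrf(t):
    """Illustrative double-gamma HRF: a peak minus a smaller, later undershoot.

    The shape values (6 and 16) and the undershoot ratio (1/6) are assumptions.
    """
    return gamma_pdf(t, 6.0) - gamma_pdf(t, 16.0) / 6.0


def fixation_regressor(onsets_s, n_scans, tr_s=TR_S, hrf_length_s=32.0):
    """Predicted BOLD time course, sampled once per scan, for fixation onsets."""
    regressor = []
    for scan in range(n_scans):
        t_scan = scan * tr_s
        value = sum(
            double_gamma_hrf(t_scan - onset)
            for onset in onsets_s
            if 0.0 < t_scan - onset <= hrf_length_s
        )
        regressor.append(value)
    return regressor


# Hypothetical first-fixation onsets (seconds) for one stimulus category.
onsets_words = [4.2, 4.9, 5.6, 17.8, 18.3]
reg = fixation_regressor(onsets_words, n_scans=20)
```

One such regressor per stimulus category would enter the design matrix alongside the movement parameters. Because fixations follow each other within a few hundred milliseconds, their predicted responses overlap and sum, which is why the approach depends on sufficient variability in the fixation sequence.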

Activation for each of the four baseline contrasts was examined by 
t-tests thresholded at a voxel-level (height) of P < 0.005 (uncorrected)
and a cluster-level (extent) of P < 0.05 (corrected for multiple
comparisons using the family-wise error rate). The resulting activation maps
were combined and used as a mask to search for differences between 
reading material (words and pseudowords) and nonreading material 
(line and Hebrew strings). These analyses were thresholded at the same 
voxel-level and cluster-level threshold used for the baseline contrasts. 
Results

Behavioral and Eye Tracking Results 

Task accuracy was close to perfect with on average 96.14%
correct responses (SD = 4.17%). As evident from Table 2,
Hebrew strings elicited longer looking times and higher
numbers of fixations and regressions than the other stimulus
types, Fs > 11.96, Ps < 0.001. In addition, words, compared
with pseudowords, elicited shorter first fixation durations,
t(17) = 2.79, P < 0.05, shorter first pass reading times,
t(17) = 5.38, P < 0.001, shorter total reading times, t(17) = 7.16,
P < 0.001, and a lower total number of fixations, Wilcoxon
Z = 2.11, P < 0.05.

Table 2

Means (and standard deviations) of the eye tracking results

                                  Words          Pseudowords    Line strings   Hebrew strings
First fixation duration (ms)      295 (38)       314 (55)       313 (54)       357 (66)
First pass reading time (ms)      431 (67)       465 (75)       457 (103)      625 (141)
Total reading time (ms)           532 (61)       596 (69)       571 (116)      891 (179)
First pass number of fixations    1.48 (0.26)    1.50 (0.28)    1.51 (0.25)    1.83 (0.39)
Total number of fixations         1.87 (0.29)    1.96 (0.33)    1.94 (0.33)    2.67 (0.47)
Number of regressions             0.27 (0.10)    0.30 (0.13)    0.29 (0.11)    0.45 (0.19)

fMRI Results

As shown in Figure 2, each of the four baseline contrasts 
(words > fixation, pseudowords > fixation, line strings > fix- 
ation, Hebrew strings > fixation) resulted in activation of a 
similar bilateral task-positive network. This network included 
bilateral occipital regions extending ventrally in posterior tem- 
poral regions (inferior, middle, and superior temporal gyri), 
and dorsally in superior parietal and postcentral regions.
Furthermore, activation was identified in bilateral precentral,
inferior temporal, middle temporal, and supplementary motor
regions, as well as in the cerebellum and in subcortical regions
(putamen, pallidum, caudate nuclei, thalamus, and middle
cingulum). Words and pseudowords showed a slight left-lateralization
(especially in temporal regions) whereas line and Hebrew
strings showed higher bilateral occipito-parietal activation. 

The results from the four separate baseline contrasts were 
inclusively combined in a disjunction mask which was used to 
search for differences between the stimulus categories. That is, 
the analysis was restricted to voxels which showed reliable 
activation in at least one of the four baseline contrasts. 
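
The masking step amounts to a voxel-wise logical OR across the four thresholded baseline maps. The sketch below illustrates this on toy data; the flat list representation, the function name, and the example values are hypothetical and do not reflect how SPM8 stores images.

```python
def disjunction_mask(*maps):
    """Voxel-wise logical OR across binary (thresholded) activation maps."""
    if not maps:
        raise ValueError("at least one map is required")
    n_voxels = len(maps[0])
    if any(len(m) != n_voxels for m in maps):
        raise ValueError("all maps must cover the same voxels")
    return [any(values) for values in zip(*maps)]


# Toy thresholded maps over six voxels (True = suprathreshold vs. baseline).
words = [True, False, False, True, False, False]
pseudowords = [True, True, False, False, False, False]
line_strings = [False, False, False, True, False, False]
hebrew_strings = [False, False, False, True, True, False]

mask = disjunction_mask(words, pseudowords, line_strings, hebrew_strings)
n_search_voxels = sum(mask)  # voxels entering the between-category comparisons
```

Restricting the comparisons to this mask excludes voxels that responded to none of the four stimulus types, such as the last voxel in the toy example.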




Figure 2. Surface rendering of the baseline contrasts (left, right, and ventral views with the cerebellum removed, respectively). Panels: words > fixation, pseudowords > fixation, line strings > fixation, Hebrew strings > fixation.
Table 3

Regions with higher activation for reading material (words and pseudowords) compared with
nonreading material (line and Hebrew strings)

Region                          MNI coordinates          Z       Extent (voxels)
                                x      y      z
Words > line strings
  L posterior MTG               -57    -55    4          4.37    118
  L STS                         -51    -34    10         3.77
  L MTG                         -48    -49    4          3.34
  L STG                         -48    -16    -16        3.22
Words > Hebrew strings
  L posterior MTG               -60    -58    7          4.60    124
  L SMG                         -60    -37    25         4.15
  L STS                         -51    -37    7          4.11
  L STG                         -48    -46    19         3.54
Pseudowords > line strings
  L anterior OTS                -42    -43    -11        3.83    108
  L MTG                         -48    -49    4          3.73
  L STS                         -57    -37    4          3.51
  L posterior MTG               -57    -58    4          3.45
Pseudowords > Hebrew strings
  L STS                         -66    -34    7          4.09    118
  L posterior MTG               -60    -55    7          3.95
  L anterior OTS                -45    -43    -14        3.84
  L MTG                         -54    -37    4          3.56

L, left; MTG, middle temporal gyrus; OTS, occipito-temporal sulcus; SMG, supramarginal gyrus;
STG, superior temporal gyrus; STS, superior temporal sulcus.



Figure 3. Surface rendering of the differences between stimulus types. 



Figure 3 and Table 3 show the regions with higher activation 
for reading material (words and pseudowords) compared with 
nonreading material (line and Hebrew strings). As expected, 
reading material compared with nonreading material led to 
higher activation in left posterior temporal regions. Specifically, 
words, compared with line strings, showed higher activation in
the left posterior middle temporal gyrus (MTG) and in the left 
superior temporal sulcus (STS). Compared with Hebrew strings, 
words showed higher activation in similar regions, with an 
additional small extension in the left supramarginal gyrus (SMG). 

The activation pattern for pseudowords was slightly different.
Compared with line strings, the maximum of the activation
difference was not located in middle temporal regions but in 
the left occipito-temporal sulcus. However, submaxima were 
similar to those for words and located in the left MTG and in 
the left STS. Pseudowords compared with Hebrew strings led 
to higher activation in similar coordinates, but the maximum 
was not located in the occipito-temporal sulcus (OTS) but in 
the left STS. 

Unexpectedly, our analysis failed to identify higher activation 
for reading material compared with non-reading material in left 
inferior frontal language regions. However, after omitting the 
cluster-level correction, reading-related activation was identified 
in opercular and triangular parts of the left inferior frontal gyrus 
(IFG), as well as in the left insula. This tendency was also 
evident from stronger and more extended activation for words 
and pseudowords compared with line and Hebrew strings illus- 
trated in Figure 2. 

The finding of higher activation for pseudowords (but not 
for words) compared with non-reading material in the left OTS 
was of specific interest. For closer inspection of this effect, we 
directly compared activation elicited by pseudowords to acti- 
vation elicited by words. This analysis revealed a tendency for 
higher activation for pseudowords in an anterior aspect of the 
left OTS (only statistically significant without the cluster-level 
correction). To investigate the activation pattern in the OTS
more precisely, we conducted a region of interest (ROI) analysis.
This analysis was focused on the ventral visual stream of
both hemispheres and was similar to ROI analyses from recent 
fMRI studies on visual word recognition (Brem et al. 2006; 
Vinckier et al. 2007; Van der Mark et al. 2009, 2011; Richlan
et al. 2010; Schurz et al. 2010; Szwed et al. 2011). Data repre- 
senting signal change estimates (in arbitrary units) for each of 
the four stimulus categories versus baseline were extracted from 
four left hemisphere spheres (6 mm radius) and from four
homologue right hemisphere spheres. Along the y-axis (anterior to
posterior gradient), the centers of the spheres were equidistantly 
spaced by 12 mm so that the spheres did not overlap each other. 
The location of the spheres and the results of the ROI analyses 
are provided in Figure 4. Differences between stimulus cat- 
egories were analyzed by ANOVAs and significant post hoc pair- 
wise comparisons are indicated by asterisks. 
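
The ROI geometry can be made explicit with a short sketch. It assumes that the four left hemisphere spheres share the x and z coordinates of the most anterior ROI reported in the text (x = -42, y = -40, z = -14) and step posteriorly in 12 mm increments; only the two most anterior centers are stated explicitly, so the remaining two, like the function names, are inferred for illustration.

```python
import math

RADIUS_MM = 6.0    # sphere radius (from the text)
SPACING_MM = 12.0  # center-to-center distance along the y-axis (from the text)


def roi_centers(x, y_anterior, z, n_rois=4, spacing_mm=SPACING_MM):
    """Sphere centers from anterior to posterior along the y-axis (MNI, mm)."""
    return [(x, y_anterior - i * spacing_mm, z) for i in range(n_rois)]


def mirror(centers):
    """Homologue right hemisphere centers, obtained by flipping the sign of x."""
    return [(-x, y, z) for x, y, z in centers]


def spheres_overlap(c1, c2, radius_mm=RADIUS_MM):
    """True if two spheres of equal radius share interior volume."""
    return math.dist(c1, c2) < 2 * radius_mm


left = roi_centers(-42, -40, -14)
right = mirror(left)

# Adjacent spheres touch at a single point but do not overlap.
any_overlap = any(
    spheres_overlap(a, b) for a, b in zip(left, left[1:])
)
```

With a 6 mm radius, a 12 mm spacing is the smallest center distance at which neighboring spheres do not share interior volume, so the ROIs tile the ventral stream without gaps larger than necessary.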

The only region which showed a lexicality effect, that is, 
higher activation for pseudowords compared with words, was 
the most anterior ROI in the left OTS around x = -42, y = -40,
z = -14. In addition, this region showed substantially higher
activation for reading material compared with nonreading
material. The slightly posterior left ROI around x = -42,
y = -52, z = -14, also showed significantly higher activation
for words and pseudowords compared with line strings. 
However, there was no difference compared with the visually 
more letter-like Hebrew strings. In the two most posterior left 
ROIs, the Hebrew strings exhibited the highest activation 
levels, followed by the line strings. The same pattern was ob- 
served for the right hemisphere ROIs. None of the right hemi- 
sphere regions showed higher activation for words or 
pseudowords compared with line or Hebrew strings. 

Discussion 

The present study investigated the feasibility of fixation-related 
analysis of fMRI data for the domain of reading research. 
Figure 4. ROI-based analysis of the ventral visual stream. Bar plots represent signal change estimates (in arbitrary units) and standard errors of the mean (SEM). Statistically
significant post hoc pairwise comparisons are indicated by asterisks: *P < 0.05, **P < 0.01, ***P < 0.001. W, words; PW, pseudowords; LS, line strings; HS, Hebrew strings. ROI centers shown include -42 -40 -14 (left) and 42 -40 -14 (right).



Before applying the approach to natural reading (i.e., in the 
context of sentences), we provided the proof of concept that 
analysis of fMRI responses from self-paced fixations is sensitive 
enough to identify activation differences between reading 
material and nonreading material. This constitutes the prere- 
quisite for subsequent fMRI studies utilizing self-paced natural 
sentence reading. In the following we will discuss the fMRI 
findings as well as some methodological considerations and 
challenges for future studies. 

Baseline Contrasts 

As expected, fixation of each of the four stimulus types (words, 
pseudowords, line strings, Hebrew strings) resulted in activation
of a similar bilateral network including occipital, temporal,
parietal, frontal, cerebellar, and subcortical regions. This 
network is typically associated with task-positive activation (Fox 
et al. 2005; Fox and Raichle 2007; Power et al. 2011) and is 
thought to underlie perceptual processes, which operate on ex- 
ternal information, as is the case during visual string processing 
(Binder et al. 2009). Activation of this network was an important
indicator for the quality of our data, as this finding replicates 
findings from previous fMRI studies with reading material, 
which used traditional analysis (e.g., Carreiras et al. 2007; 
Fiebach et al. 2007; Jobard et al. 2007; Cohen et al. 2008; Kron- 
bichler et al. 2009; Van der Mark et al. 2009; Schurz et al. 2010; 
Twomey et al. 2011). We demonstrated that fixation-related fMRI 
analysis worked on a fundamental level (i.e., on baseline
contrasts) in the present dataset. This was an important prerequisite
for further comparisons between stimulus categories. 

Differences Between Reading Material and Nonreading
Material

The critical expectation referred to the comparisons between 
reading material (words and pseudowords) and nonreading 
material (line and Hebrew strings). We expected higher acti- 
vation for reading material in three regions typically associated 
with reading or reading-related tasks. Specifically, these were 
the left posterior temporal region, the left ventral occipito- 
temporal region, and the left inferior frontal region. Numerous 
functional neuroimaging studies have identified these regions 
as core regions for reading and visual letter string processing 
(see reviews by Jobard et al. 2003; Demonet et al. 2004; Schlag- 
gar and McCandliss 2007; Shaywitz and Shaywitz 2008; Price 
2012; Richlan 2012). Although we used an implicit reading 
task, we expected to trigger automatic reading of words and 
pseudowords upon fixation. 

In line with our expectations, we found higher activation 
for words compared with either type of nonreading strings 
(line and Hebrew strings) in left posterior middle and 
superior temporal regions. For pseudowords compared 
with nonreading strings, both left posterior middle/superior 
temporal regions and left ventral occipito-temporal regions 
were activated. 

Notably, in left inferior frontal language regions, activation 
differences between reading material and nonreading material 
were identified only without the cluster-level correction. Small 
clusters were found in opercular and triangular parts of the left 
IFG, as well as in the left insula. The consistency with findings 
from previous studies using explicit reading tasks shows that — 
although it was not necessary to read the letter strings to 
perform the task — our implicit reading task resulted in reliable 
automatic word and pseudoword processing. 

Left Middle and Superior Temporal Regions 

The most consistent activation across all four comparisons
(words > line strings, words > Hebrew strings, pseudowords > 
line strings, pseudowords > Hebrew strings) was found in the 
left posterior MTG. This region is typically associated with se- 
mantic processing and its important role in reading has exten- 
sively been documented not only by fMRI studies (e.g., Jobard 
et al. 2003; Vigneau et al. 2006; Binder and Desai 2011; Price 
2012) but also by EEG and MEG studies (e.g., Simos et al. 
2002; Lau et al. 2008; Vartiainen et al. 2011). 

In addition to the left posterior MTG, the present study 
identified higher activation for reading material (words and 
pseudowords) compared with nonreading material (line and 
Hebrew strings) in more dorsal regions including the left STS, 
the left superior temporal gyrus (STG), and an inferior part of 
the left SMG. Taken together, these regions, classically de- 
scribed as Wernicke's area, are thought to play a central role in 
the integration of auditory and visual information (e.g., van At- 
teveldt et al. 2004). They are involved in both the perception 
and production of speech (Hickok and Poeppel 2007; Price
2012). During reading, their main function is related to gra- 
pheme-phoneme conversion (Jobard et al. 2003; Vigneau 
et al. 2006). The identification of reliable activation in these 
well-established reading-related regions (middle and superior
temporal regions) is an important manipulation check, which
is necessary for demonstrating the validity of the fixation- 
related fMRI approach for reading research. 

Left Occipito-Temporal Regions 

The whole-brain analysis identified activation in the left OTS 
only for pseudowords compared with nonreading material, 
but not for words compared with nonreading material. The 
direct comparison between words and pseudowords was only 
statistically significant without the cluster-level correction. This 
analysis identified a small cluster with higher activation for 
pseudowords compared with words in an anterior portion of 
the left OTS. The left OTS, which is located between the fusi- 
form gyrus and the inferior temporal gyrus, is considered one 
of the core parts of the reading network. Therefore, we investi- 
gated the activation pattern in this region more precisely. 
Specifically, in order to provide further evidence for the plausi- 
bility of the present results using the fixation-related analysis 
approach, we conducted an ROI analysis based on coordinates 
from the literature (Brem et al. 2006; Vinckier et al. 2007; Van 
der Mark et al. 2009, 2011; Richlan et al. 2010; Schurz et al. 
2010; Szwed et al. 2011).

Higher activation for pseudowords compared with words 
was limited to the most anterior ROI in the left hemisphere. In 
addition, as expected from its functional role in reading, this 
region also showed higher activation for reading compared 
with nonreading material. The location of the ROI corresponds 
to the most anterior part of the visual word form system 
(Vinckier et al. 2007; Van der Mark et al. 2009; Richlan et al.
2010), whose exact function is still the subject of considerable
debate (e.g., Dehaene and Cohen 2011; Price and Devlin
2011). The most anterior part of this system was proposed to
be involved in the processing of morphemes and short whole 
words (Kronbichler et al. 2004, 2007; Dehaene et al. 2005; 
Schurz et al. 2010). It was supposed to play a role in lexico- 
semantic reading (Price 2012) and is situated just posterior to 
the basal temporal language area associated with heteromodal 
semantic processing (Jobard et al. 2003; Binder and Desai
2011). 

Higher activation for words and pseudowords compared 
with line strings was also found in the slightly posterior left 
ROI, which corresponds to the more classical location of the 
visual word form area (VWFA) (Cohen et al. 2000; Dehaene 
et al. 2002; Jobard et al. 2003). In line with the original formu- 
lation of the functional role of the VWFA, that is, a role in pre- 
lexical processing during reading, we found no difference 
between words and pseudowords but higher activation for 
these letter strings compared with simple line strings. The high 
activation level for Hebrew strings may have resulted from 
their increased visual complexity and their more letter-like ap- 
pearance relative to line strings. 

The high processing demands (as evidenced by the eye 
tracking findings) may also be the reason for the high acti- 
vation levels elicited by Hebrew strings in the two most pos- 
terior left ROIs. The right hemisphere ROIs showed similar 
activation patterns with highest activation levels for line and 
Hebrew strings throughout the ventral stream. This finding is 
in line with the notion of engagement of the right ventral 
visual cortex in the processing of non-linguistic stimuli (e.g., 
Kanwisher 2010). In summary, the results of the ROI analysis 
are largely in line with findings from the literature, thus
providing further evidence for the feasibility of the fixation-related
analysis approach.

The present study was not explicitly designed to investigate 
lexicality effects in the left OTS and probably lacked both a 
task emphasizing such effects and statistical power to identify 
reliable differences using a corrected threshold on the whole- 
brain level. However, as will be explicated in the following, the 
fixation-related analysis approach may be a promising method 
to clarify some of the inconsistent findings in the literature re- 
garding lexicality effects in the left OTS. To illustrate, previous 
studies found higher activation for pseudowords compared 
with words (e.g., Mechelli et al. 2003; Kronbichler et al. 2004, 
2007, 2009; Bruno et al. 2008), no activation differences 
between words and pseudowords (e.g., Dehaene et al. 2002; 
Carreiras et al. 2007; Vinckier et al. 2007), and even higher acti- 
vation for words compared with pseudowords (Fiebach et al. 
2002; Diaz and McCarthy 2007).

The present study is the first step in utilizing self-paced fix- 
ations during reading as markers for calculating hemody- 
namic responses of the VWFA. Future fMRI studies using this 
approach may enable novel, more natural reading situations, 
thereby avoiding some of the methodological problems of 
previous studies (e.g., uncontrolled top-down influences on 
activation, artificial tasks, long presentation times), which 
are likely to be the underlying cause for the inconsistent find- 
ings regarding the VWFA. As reviewed by Dehaene and 
Cohen (2011), the type of in-scanner task used in an ex- 
periment can massively influence activation patterns. The 
same is true for unnaturally long or short exposure times to
stimuli. 

A further benefit of the combined recording of eye tracking 
and fMRI data is that it is no longer necessary to rely on a task 
(e.g., one-back, target detection) to assure that the participants 
attend to the stimulus material. Instead, eye movement 
measures can directly be used to infer participants' attention. 
This can be utilized in order to realize silent reading paradigms 
during scanning. Such paradigms used to be problematic 
because without eye tracking the experimenter had no means 
of observing the participants' behavior during scanning. 

Left Inferior Frontal Regions 

At first sight, it is quite astonishing that the present study ident- 
ified the left IFG in the comparison between reading material 
(words and pseudowords) and non-reading material (line and 
Hebrew strings) only after omitting the cluster-level correction. 
However, although activation of the left IFG is typically found 
in fMRI reading studies, it is still unclear how specifically this
region is related to reading processes per se. While some 
authors argue for a specific role in reading, for example related 
to grapheme-phoneme conversion (e.g., Jobard et al. 2003) or 
lexical access (e.g., Fiebach et al. 2002; Heim et al. in press), 
other findings point to a number of different processes sup- 
ported by the IFG, for example, different language processes 
related to speech planning and comprehension (e.g., Price
2012), semantics (e.g., Binder and Desai 2011), executive func- 
tions, affective, and interoceptive processes including working 
memory, reasoning, decision-making, inhibition, attention, 
and emotion (Laird et al. 2011). 

The use of an implicit reading task, which did not demand 
excessive phonological or lexical processing of the words and 
pseudowords, may have been responsible for the weak left
IFG difference between reading material and nonreading
material in the present study. Furthermore, by use of a delayed 
response paradigm, stimulus processing and decision-making 
were temporally decoupled, which may have suppressed 
response-related decision processes during fixation. Likewise, 
the high rate (i.e., fast succession) of fixations may have left 
only little time for later, higher order top-down processes. 
Therefore, it is plausible to assume that the fixation-related 
fMRI signal is primarily driven by earlier, bottom-up processes. 
However, this interpretation is highly speculative and certainly 
requires further investigation. 

Methodological Considerations 

We could replicate the successful use of fixations as markers for 
calculating hemodynamic brain responses developed by 
Marsman et al. (2012) and we were able to extend the approach 
to the domain of visual letter string processing. At first sight, 
given the temporal properties of the fMRI signal (imprecise and 
sluggish), it is quite astonishing that fixation-related analysis of 
fMRI data does not only work for categories that have well- 
established vastly distinct neural signatures (faces and houses), 
but also for categories that differ more subtly (visual character 
strings). Specifically, we showed that the initial fixations of 
items (about 300 ms) were sufficiently long to indicate reliable 
fMRI responses. Moreover, although fixations occurred at a 
high rate (about 3 per second), responses corresponding to 
these fixations could be temporally separated. 
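To make the notion of fixations as markers concrete, the following minimal sketch builds a predicted hemodynamic time course from fixation onsets and durations. It is an illustration only, not the analysis pipeline of the present study: the double-gamma response shape (shape parameters 6 and 16, undershoot ratio 1/6), the repetition time of 2 s, and the fixation times are assumptions chosen for the example, and the function names are hypothetical.

```python
import math


def hrf(t):
    """Double-gamma response at t seconds after an event.

    Shape parameters 6 and 16 and an undershoot ratio of 1/6 are a common
    default; they are assumed here, not taken from the study.
    """
    if t <= 0.0:
        return 0.0
    peak = t ** 5 * math.exp(-t) / math.gamma(6)
    undershoot = t ** 15 * math.exp(-t) / math.gamma(16)
    return peak - undershoot / 6.0


def fixation_regressor(onsets, durations, n_scans, tr, dt=0.1):
    """Predicted signal at each scan for a set of fixations.

    Each fixation is modeled as a boxcar (onset to onset + duration) on a
    fine time grid with step dt, convolved with the response function and
    then sampled once per scan.
    """
    step = int(round(tr / dt))          # fine-grid samples per scan
    n_fine = n_scans * step
    neural = [0.0] * n_fine
    for onset, dur in zip(onsets, durations):
        first = int(round(onset / dt))
        last = max(first + 1, int(round((onset + dur) / dt)))
        for i in range(first, min(last, n_fine)):
            neural[i] += 1.0
    kernel = [hrf(k * dt) for k in range(int(round(32.0 / dt)))]
    regressor = []
    for scan in range(n_scans):
        idx = scan * step
        value = 0.0
        for k in range(min(idx + 1, len(kernel))):
            value += neural[idx - k] * kernel[k]
        regressor.append(value * dt)
    return regressor


# Hypothetical fixations: four items fixated for 300 ms each, about 3 per second.
onsets = [1.0, 1.35, 1.7, 2.1]
durations = [0.3, 0.3, 0.3, 0.3]
reg = fixation_regressor(onsets, durations, n_scans=15, tr=2.0)
print([round(x, 3) for x in reg])
```

In a real analysis, one such regressor per stimulus category would enter a general linear model; the sketch only shows how brief, rapidly succeeding fixations translate into an overlapping predicted response.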

The first finding — regarding brief exposure time to stimuli — is 
not surprising, as studies employing rapid event-related designs 
often use similar or even shorter presentation times (e.g., Wheat- 
ley et al. 2005; Yarkoni et al. 2008). Even extremely short presen- 
tation of letter strings (<50 ms and masked) has been shown to 
result in reliable fMRI responses (e.g., Dehaene et al. 1998, 2001; 
Diaz and McCarthy 2007; Nakamura et al. 2007). Therefore, the
time spent on looking at an item in the present study is clearly 
sufficient to elicit reliable fMRI responses. 

The latter issue — temporal separation of rapidly succeeding 
fixations — deserves closer attention. It has been shown that de- 
tectability of individual responses in a time course of overlap- 
ping signals crucially depends on temporal randomness of 
individual events. Specifically, Dale (1999) found that while 
statistical efficiency of event-related fMRI designs drops dra- 
matically as inter-stimulus intervals get shorter, this circum- 
stance can be avoided by using temporal jittering (i.e., 
randomly spaced trials). Normally, temporal jittering is con- 
trolled for during the design of rapid event-related fMRI exper- 
iments. In the case of fixation-related fMRI experiments, 
temporal jittering cannot be planned but rather results from 
the viewing behavior of the participants. However, it is plaus- 
ible to assume (and confirmed by our results) that onset times 
of fixations are sufficiently random to achieve reasonable stat- 
istical efficiency. 
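The effect of temporal jittering can be illustrated with a small simulation. The sketch below compares events at a perfectly regular rate of 3 per second with events at the same mean rate but random spacing. It uses a simplified proxy for detection efficiency, namely the sum of squared deviations of the predicted signal from its mean (for a model with one regressor and a constant, the variance of the estimate is inversely proportional to this quantity). This is not the estimator used by Dale (1999), and the exponentially distributed intervals, the response shape, and the scan timing are assumptions made for illustration only.

```python
import math
import random


def hrf(t):
    """Double-gamma response (assumed shape parameters 6 and 16, ratio 1/6)."""
    if t <= 0.0:
        return 0.0
    return (t ** 5 * math.exp(-t) / math.gamma(6)
            - t ** 15 * math.exp(-t) / math.gamma(16) / 6.0)


def predicted_signal(onsets, scan_times):
    """Sum of responses to all events in the 32 s preceding each scan."""
    return [sum(hrf(t - o) for o in onsets if 0.0 < t - o < 32.0)
            for t in scan_times]


def detection_efficiency(signal):
    """Sum of squared deviations from the mean (simplified efficiency proxy)."""
    mean = sum(signal) / len(signal)
    return sum((x - mean) ** 2 for x in signal)


run_length = 200.0   # seconds (assumed)
rate = 3.0           # events per second, as reported for fixations
# Scans every 2 s; the first 32 s are skipped to avoid the initial ramp-up.
scan_times = [2.0 * s for s in range(16, 100)]

# Regular spacing: one event every 1/3 s.
fixed_onsets = [k / rate for k in range(int(run_length * rate))]

# Random spacing with the same mean rate (exponential intervals, assumed).
rng = random.Random(0)
jittered_onsets = []
t = rng.expovariate(rate)
while t < run_length:
    jittered_onsets.append(t)
    t += rng.expovariate(rate)

eff_fixed = detection_efficiency(predicted_signal(fixed_onsets, scan_times))
eff_jittered = detection_efficiency(predicted_signal(jittered_onsets, scan_times))
print("fixed:", eff_fixed, "jittered:", eff_jittered)
```

With regular spacing the overlapping responses add up to a nearly constant signal, so the proxy is close to zero. With random spacing the predicted signal fluctuates and the proxy is clearly larger. Real fixation intervals are not exponentially distributed, so the sketch only demonstrates the principle.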

A difference between the pioneering study by Marsman 
et al. (2012) and our study was that while Marsman et al. 
(2012) based their analysis strategy exclusively on ROIs, we 
searched the whole brain for reliable activation differences 
between stimulus categories. Whereas the ROI-based approach 
facilitates statistical testing by reducing the multiple compari- 
sons problem, it is vulnerable to missing potentially interesting 
activation in regions that were not analyzed. In contrast, an
unconstrained whole-brain search strategy has the potential to 
discover activation differences throughout the whole brain. It
is not limited by a priori assumptions and may lead to novel, 
unexpected findings. On the downside, one has to deal with 
the problem of computing numerous statistical tests (about 
45 000 in our case) and to adapt the statistical thresholds for 
the number of these tests. Therefore, high statistical power is 
necessary in order to detect reliable effects. However, as 
evident from our results, the effects we were interested in 
(reading material vs. nonreading material) were robust enough 
to survive even thresholds corrected for multiple comparisons. 
This is particularly impressive given that, up to the present 
study, the feasibility of the fixation-related fMRI approach for 
the domain of reading research was unclear. We assume that 
future experiments can be optimized to detect even
more subtle effects (e.g., lexicality effects, word frequency 
effects, semantic effects) using corrected tests on the whole- 
brain level. 
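The scale of the multiple comparisons problem can be made explicit with a short calculation. The sketch below uses the figure of about 45 000 tests mentioned above and a Bonferroni correction, which is chosen here only because it is the simplest case. It is not the correction applied in the present study, and the familywise alpha of .05 and the uncorrected threshold of .001 are assumptions for illustration.

```python
from statistics import NormalDist

n_tests = 45000            # approximate number of voxel-wise tests
familywise_alpha = 0.05    # assumed familywise error rate
uncorrected_p = 0.001      # assumed uncorrected voxel-level threshold

# Bonferroni: divide the familywise alpha by the number of tests.
per_test_alpha = familywise_alpha / n_tests

# One-sided z value that a single test must exceed at the corrected level.
z_threshold = NormalDist().inv_cdf(1.0 - per_test_alpha)

# Number of false positives expected by chance without any correction,
# if every null hypothesis were true.
expected_false_positives = n_tests * uncorrected_p

print("per-test alpha:", per_test_alpha)
print("z threshold:", z_threshold)
print("expected false positives (uncorrected):", expected_false_positives)
```

Under these assumptions, the per-test alpha is about 1.1 x 10^-6, which corresponds to a z value of roughly 4.7, and about 45 voxels would be expected to pass the uncorrected threshold by chance alone. Bonferroni ignores the spatial correlation between neighboring voxels and is therefore conservative for fMRI data.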

The next step would be to apply the fixation-related fMRI 
approach to more natural reading situations. For such an enter- 
prise, one would abandon the circular array of stimuli and 
would rather present lists of words or even whole sentences in 
a line during a silent reading task. This would lead to a smaller 
horizontal distance between the items (hence shorter sac- 
cades) and (due to parafoveal preprocessing) would result in 
more fluent processing, expressed by shorter fixation dur- 
ations. A crucial feature of the present approach, which allowed 
separation of overlapping fixation-related fMRI responses, was 
the temporal jittering of fixations resulting from the eye move- 
ment behavior of the participants. It is unclear whether in a 
sentence reading task onset times of fixations would be suffi- 
ciently random to achieve reasonable statistical efficiency. Fur- 
thermore, a line array would allow parafoveal preprocessing of 
upcoming, not yet fixated words. Up to now, it is an open 
question whether the fixation-related fMRI approach can be 
realized in the face of the complex interplay of foveal and par- 
afoveal processes and the faster timing of a natural reading 
situation. As already mentioned in the Introduction section, a 
further challenge for studies attempting to employ ecologically 
valid self-paced reading paradigms is a certain loss of exper- 
imental control. Specifically, there would be more variability 
across trials and participants with respect to eye movement be- 
havior (e.g., number of fixations, refixations, regressions, 
word skippings, etc.) compared with tightly controlled, rather
artificial experimental situations. It is up to future studies to 
deal with these challenges and to refine and advance the 
fixation-related fMRI approach where necessary to allow the 
investigation of specific visual word recognition processes 
during self-paced natural sentence reading. 

Conclusion 

The present study showed the feasibility of fixation-related 
fMRI analysis for the domain of reading research. Using self- 
paced eye movements as markers for hemodynamic brain 
responses, we found reliable reading-related activation in 
important left hemisphere reading regions. Specifically, statisti- 
cal power was large enough to identify reliable differences 
between reading material and nonreading material on the 
whole-brain level using thresholds corrected for multiple com- 
parisons. Additional ROI analyses showed the sensitivity of the 
present approach to detect even more subtle (i.e., lexicality) 
effects in the left occipito-temporal cortex. We provided the
proof of concept and analysis framework for future combined
eye tracking and fMRI studies on reading using the fixation- 
related analysis approach. This approach may enable the inves- 
tigation of specific visual word recognition processes during 
self-paced natural sentence reading (e.g., parafoveal prepro- 
cessing), which were previously inaccessible with fMRI. 